01. Introduction

Why Learn Spark?

Spark is currently one of the most popular tools for big data analytics. You might have heard of other tools such as Hadoop, a slightly older technology that some companies still use. Spark is generally faster than Hadoop MapReduce, largely because it keeps intermediate results in memory instead of writing them to disk between steps. That speed is a big reason Spark has become more popular over the last few years.

There are many other big data tools and systems, each with its own use case. For example, there are database systems like Apache Cassandra and SQL query engines like Presto. But Spark is still one of the most popular tools for analyzing large data sets.

Here is an outline of the topics we are covering in this lesson:

  • What is big data?
  • Review of the hardware behind big data
  • Introduction to distributed systems
  • Brief history of Spark and big data
  • Common Spark use cases
  • Other technologies in the big data ecosystem